Low-Rank Gradient Descent
Authors
Abstract
Several recent empirical studies demonstrate that important machine learning tasks, such as training deep neural networks, exhibit a low-rank structure, where most of the variation in the loss function occurs only in a few directions of the input space. In this paper, we leverage such low-rank structure to reduce the high computational cost of canonical gradient-based methods such as gradient descent (GD). Our proposed Low-Rank Gradient Descent (LRGD) algorithm finds an $\epsilon$-approximate stationary point of a $p$-dimensional function by first identifying $r \leq p$ significant directions, and then estimating the true $p$-dimensional gradient at every iteration by computing directional derivatives only along those $r$ directions. We establish that the "directional oracle complexities" of LRGD for strongly convex and non-convex objective functions are $\mathcal{O}(r \log(1/\epsilon) + rp)$ and $\mathcal{O}(r/\epsilon^2 + rp)$, respectively. Therefore, when $r \ll p$, LRGD provides a significant improvement over the known complexities $\mathcal{O}(p \log(1/\epsilon))$ and $\mathcal{O}(p/\epsilon^2)$ of GD in the strongly convex and non-convex settings. Furthermore, we formally characterize the classes of exactly and approximately low-rank functions. Empirically, using real and synthetic data, LRGD provides significant gains over GD when the data has low-rank structure, and in the absence of such structure, LRGD does not degrade performance compared to GD. This suggests that LRGD could be used in practice in any setting in place of GD.
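In essence, each LRGD iteration replaces the $p$ partial-derivative queries of GD with only $r$ directional-derivative queries along a fixed set of significant directions. Below is a minimal sketch of that mechanism (not the authors' implementation), assuming the directions $U$ are already identified and using a forward finite difference as the directional-derivative oracle:

```python
import numpy as np

def lrgd_step(f, x, U, lr=0.1, h=1e-6):
    """One LRGD-style step; U is a (p, r) matrix of orthonormal directions."""
    grad_est = np.zeros_like(x)
    for i in range(U.shape[1]):
        u = U[:, i]
        dd = (f(x + h * u) - f(x)) / h   # directional derivative of f along u
        grad_est += dd * u               # gradient estimate restricted to span(U)
    return x - lr * grad_est

# Toy usage: a quadratic on R^50 whose variation lies in a 2-D subspace.
p, r = 50, 2
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((p, r)))  # the r significant directions
f = lambda x: 0.5 * np.sum((U.T @ x) ** 2)        # an exactly rank-r objective
x = rng.standard_normal(p)
for _ in range(200):
    x = lrgd_step(f, x, U)
print(f(x))  # approaches 0 using r, not p, derivative queries per iteration
```

In the paper, the significant directions are identified in an initial phase, which is what the additive $rp$ term in the complexities above reflects; the sketch assumes that phase has already produced $U$.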
Similar Resources
Nonconvex Low-Rank Matrix Recovery with Arbitrary Outliers via Median-Truncated Gradient Descent
Recent work has demonstrated the effectiveness of gradient descent for directly recovering the factors of low-rank matrices from random linear measurements in a globally convergent manner when initialized properly. However, the performance of existing algorithms is highly sensitive in the presence of outliers that may take arbitrary values. In this paper, we propose a truncated gradient descent...
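As a rough illustration of the truncation idea (the symbols, shapes, and the threshold constant `alpha` below are illustrative, not taken from the paper), one can discard, at each iteration, the measurements whose residuals are large relative to the median residual before forming the gradient:

```python
import numpy as np

def median_truncated_gd_step(X, A, y, lr=1e-3, alpha=3.0):
    """X: (n, r) factor; A: (m, n, n) sensing matrices; y: (m,) measurements."""
    preds = np.einsum('mij,ik,jk->m', A, X, X)   # <A_i, X X^T> for each sample i
    resid = preds - y
    keep = np.abs(resid) <= alpha * np.median(np.abs(resid))  # drop likely outliers
    S = A[keep] + A[keep].transpose(0, 2, 1)     # symmetrized sensing matrices
    grad = np.einsum('m,mij,jk->ik', resid[keep], S, X)  # grad of kept squared losses
    return X - lr * grad
```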
Projected Wirtinger Gradient Descent for Low-Rank Hankel Matrix Completion in Spectral Compressed Sensing
This paper considers reconstructing a spectrally sparse signal from a small number of randomly observed time-domain samples. The signal of interest is a linear combination of complex sinusoids at R distinct frequencies. The frequencies can assume any continuous values in the normalized frequency domain [0, 1). After converting the spectrally sparse signal recovery into a low rank structured mat...
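The low-rank structure being referenced can be seen directly: lifting a sum of R complex sinusoids into a Hankel matrix yields a matrix of rank R. A small numerical check (sizes and names here are illustrative):

```python
import numpy as np
from scipy.linalg import hankel

n, R = 64, 3
rng = np.random.default_rng(1)
freqs = rng.uniform(0, 1, R)                        # R frequencies in [0, 1)
t = np.arange(n)
x = sum(np.exp(2j * np.pi * f * t) for f in freqs)  # spectrally sparse signal
H = hankel(x[: n // 2], x[n // 2 - 1 :])            # Hankel lift of the signal
print(np.linalg.matrix_rank(H))                     # prints R (= 3 here)
```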
Non-Convex Projected Gradient Descent for Generalized Low-Rank Tensor Regression
In this paper, we consider the problem of learning high-dimensional tensor regression problems with low-rank structure. One of the core challenges associated with learning high-dimensional models is computation, since the underlying optimization problems are often non-convex. While convex relaxations could lead to polynomial-time algorithms, they are often slow in practice. On the other hand, limi...
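For intuition, the projected gradient descent pattern referred to, specialized from tensors to matrices for brevity, alternates a gradient step on the least-squares loss with a projection onto the rank-r set via truncated SVD. This is a generic sketch, not the paper's algorithm:

```python
import numpy as np

def project_rank_r(B, r):
    """Best rank-r approximation of B (Eckart-Young): the non-convex projection."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def pgd_step(B, X, y, r, lr=1e-3):
    """X: (m, n1*n2) vectorized design; y: (m,) responses; B: (n1, n2) iterate."""
    resid = X @ B.ravel() - y                  # linear-model residuals
    grad = (X.T @ resid).reshape(B.shape)      # gradient of 0.5 * ||resid||^2
    return project_rank_r(B - lr * grad, r)    # step, then snap back to rank r
```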
Stochastic Variance-reduced Gradient Descent for Low-rank Matrix Recovery from Linear Measurements
We study the problem of estimating low-rank matrices from linear measurements (a.k.a., matrix sensing) through nonconvex optimization. We propose an efficient stochastic variance reduced gradient descent algorithm to solve a nonconvex optimization problem of matrix sensing. Our algorithm is applicable to both noisy and noiseless settings. In the case with noisy observations, we prove that our a...
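The variance-reduction mechanism itself is the standard SVRG estimator; a generic sketch of one epoch, where the per-sample gradient oracle `grad_i` is an assumed placeholder rather than anything from the paper:

```python
import numpy as np

def svrg_epoch(grad_i, X, n_samples, n_inner, lr, rng):
    """grad_i(i, X) returns the gradient of the i-th sample's loss at X."""
    snapshot = X.copy()
    full_grad = sum(grad_i(i, snapshot) for i in range(n_samples)) / n_samples
    for _ in range(n_inner):
        i = rng.integers(n_samples)
        # Variance-reduced direction: the stochastic gradient at X, corrected
        # by the same sample's gradient at the snapshot plus the full gradient.
        v = grad_i(i, X) - grad_i(i, snapshot) + full_grad
        X = X - lr * v
    return X
```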
Adaptive Stochastic Gradient Descent on the Grassmannian for Robust Low-Rank Subspace Recovery
In this paper, we present GASG21 (Grassmannian Adaptive Stochastic Gradient for L2,1 norm minimization), an adaptive stochastic gradient algorithm to robustly recover the low-rank subspace from a large matrix. In the presence of column-outlier corruption, we reformulate the classical matrix L2,1 norm minimization problem as its stochastic programming counterpart. For each observed data vector,...
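A much-simplified sketch of the per-vector subspace update (using a QR retraction in place of an exact Grassmannian geodesic step, so this is illustrative rather than GASG21 itself):

```python
import numpy as np

def subspace_step(U, v, lr=0.1):
    """U: (n, r) orthonormal basis; v: (n,) observed data vector."""
    w = U.T @ v                  # least-squares weights of v in span(U)
    r = v - U @ w                # residual, orthogonal to span(U)
    nrm = np.linalg.norm(r)
    if nrm > 1e-12:
        # Descent step on ||v - U w||: the unsquared residual norm is what
        # sums to the L2,1 objective over all observed columns.
        U = U + lr * np.outer(r / nrm, w)
    Q, _ = np.linalg.qr(U)       # cheap retraction back to orthonormal columns
    return Q
```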
Journal
Journal title: IEEE Open Journal of Control Systems
Year: 2023
ISSN: 2694-085X
DOI: https://doi.org/10.1109/ojcsys.2023.3315088